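Dependent variable | Split criterion | Algorithm |
Discrete | Chi-square statistic | CHAID |
Discrete | Gini index | CART |
Discrete | Entropy index | C4.5 |
Continuous | ANOVA F-statistic | CHAID |
Continuous | Variance reduction | CART |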
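The two impurity measures listed for a discrete dependent variable can be made concrete with a small sketch. The helper names `gini_index` and `entropy_index` are hypothetical (they are not part of rpart or any other package), and each is assumed to receive the class counts of a single node.

```r
# Hypothetical helpers (not from any package): impurity of one node,
# given the number of observations it holds in each class.
gini_index <- function(counts) {
  p <- counts / sum(counts)   # class proportions
  1 - sum(p^2)                # Gini index: 1 - sum of squared proportions
}

entropy_index <- function(counts) {
  p <- counts / sum(counts)
  p <- p[p > 0]               # drop empty classes; 0 * log(0) is treated as 0
  -sum(p * log2(p))           # entropy in bits
}

gini_index(c(50, 50))     # evenly mixed two-class node: 0.5
entropy_index(c(50, 50))  # evenly mixed two-class node: 1
gini_index(c(100, 0))     # pure node: 0
entropy_index(c(100, 0))  # pure node: 0
```

Both measures are largest when the classes are evenly mixed and fall to 0 for a pure node, which is why a split that lowers them is preferred.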
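> # Randomly assign each row to group 1 (train, about 70%) or group 2 (test, about 30%).
> # No seed is set, so the split and the results below differ between runs.
> index <- sample(c(1, 2), nrow(iris), replace=TRUE, prob=c(0.7, 0.3))
> train <- iris[index==1, ]
> test <- iris[index==2, ]
> library(rpart)
> # Fit a classification tree that predicts Species from all other columns
> result <- rpart(Species~., data=train)
> plot(result, margin=0.3) # margin adjusts the width of the whitespace around the plot
> text(result) # add the split labels to the plotted tree
> # Predict class labels for the test set and cross-tabulate them against the true species
> pred <- predict(result, newdata=test, type='class')
> table(condition=test$Species, pred)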
            pred
condition    setosa versicolor virginica
  setosa         14          0         0
  versicolor      0         11         4
  virginica       0          1        10
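The confusion matrix above can be summarized as an overall accuracy and a per-class recall. The sketch below rebuilds the printed counts by hand so that it runs on its own; the names `cm`, `accuracy`, and `recall` are illustrative. In a live session the same matrix would come from `table(condition=test$Species, pred)`.

```r
# Rebuild the confusion matrix printed above (rows: actual, columns: predicted).
species <- c("setosa", "versicolor", "virginica")
cm <- matrix(c(14,  0,  0,
                0, 11,  4,
                0,  1, 10),
             nrow = 3, byrow = TRUE,
             dimnames = list(condition = species, pred = species))

accuracy <- sum(diag(cm)) / sum(cm)   # correctly classified rows / all test rows
recall   <- diag(cm) / rowSums(cm)    # per class: correct / actual cases of that class

accuracy  # 35 / 40 = 0.875
recall
```

All setosa rows are classified correctly; the errors are confined to the versicolor and virginica classes.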
> result
n= 110

node), split, n, loss, yval, (yprob)
      * denotes terminal node

1) root 110 71 virginica (0.32727273 0.31818182 0.35454545)
  2) Petal.Length< 2.45 36 0 setosa (1.00000000 0.00000000 0.00000000) *
  3) Petal.Length>=2.45 74 35 virginica (0.00000000 0.47297297 0.52702703)
    6) Petal.Length< 4.75 33 0 versicolor (0.00000000 1.00000000 0.00000000) *
    7) Petal.Length>=4.75 41 2 virginica (0.00000000 0.04878049 0.95121951) *
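The printed tree uses only `Petal.Length`, with splits at 2.45 and 4.75, so its three terminal nodes can be read as plain rules. The function below is a hand-written sketch of those rules for illustration; `classify_by_rules` is a hypothetical name, and in practice `predict(result, ...)` should be used instead.

```r
# Hypothetical helper: applies the two splits from the printed tree by hand.
classify_by_rules <- function(petal_length) {
  ifelse(petal_length < 2.45, "setosa",             # node 2
         ifelse(petal_length < 4.75, "versicolor",  # node 6
                "virginica"))                       # node 7
}

classify_by_rules(c(1.4, 4.0, 5.5))
# "setosa" "versicolor" "virginica"
```

Because the training sample is random, the thresholds apply only to this particular fit; another run may choose different split points or variables.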